Revisiting Area Convexity: Faster Box-Simplex Games and Spectrahedral Generalizations

Jambulapati, Arun, Tian, Kevin

Neural Information Processing Systems

We develop a deeper understanding of area convexity's relationship with more conventional analyses of extragradient methods [Nem04, Nes07]. We also give improved solvers for the subproblems required by variants of the [She17] algorithm, designed through the lens of relative smoothness [BBT17, LFN18].
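For context, box-simplex games (which this abstract references only by name) are bilinear saddle-point problems; a sketch of the standard formulation, in our own notation rather than necessarily the paper's, is

$$\min_{x \in [-1,1]^n} \; \max_{y \in \Delta^m} \; y^\top A x + c^\top x - b^\top y,$$

where $\Delta^m$ is the probability simplex. Area convexity, introduced by [She17], is a weakening of strong convexity for the regularizer used to solve such games, tailored to the bilinear operator $A$.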




A Proofs of Linear Case

Neural Information Processing Systems

Throughout the appendix, for ease of notation, we overload the definition of the function $d$. The proof of this lemma requires Lemma A.1, which characterizes the distribution of the residual; by Pinsker's inequality, this implies a bound in total variation distance. The proof is inspired by Theorem 11.2 in [20], with modifications to our setting. First, we construct a "ghost" dataset. The most challenging aspect of the ReLU setting is that we do not have an expression for the TV error suffered by the MLE, such as Lemma 4.2 in the linear case. The proof of this lemma, as well as the other lemmas in this section, can be found in Appendix B.1. Using Lemmas B.2 and B.3, we can form a uniform bound, and a straightforward combination of Lemma 4.3 and Lemma B.4 gives the following theorem. Finally, we apply Bernstein's inequality (Theorem 2.10 of [8]).
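The two concentration tools named in this snippet are standard; as a reference sketch (our statements, not the paper's), Pinsker's inequality bounds total variation by KL divergence,

$$d_{\mathrm{TV}}(P, Q) \le \sqrt{\tfrac{1}{2}\,\mathrm{KL}(P \,\|\, Q)},$$

and Bernstein's inequality states that for independent zero-mean random variables $X_1, \dots, X_n$ with $|X_i| \le M$,

$$\Pr\Big(\sum_{i=1}^n X_i \ge t\Big) \le \exp\left(-\frac{t^2/2}{\sum_{i=1}^n \mathbb{E}[X_i^2] + Mt/3}\right).$$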


Appendix A Theory

Neural Information Processing Systems

In this section, we give the proofs of the results in the main body. Eq. (1) satisfies the triangle inequality for any scoring functions; the second inequality is proved similarly. Before presenting the proof of the theorem, we first provide some lemmas. By applying Lemma A.2, the stated bound holds with probability at least $1 - \alpha$. By Lemma A.1, we can show that the margin loss satisfies the triangle inequality. Combining Lemma A.4 with Theorem 4.4, the corresponding bound holds for any scoring function. Based on Theorem A.6, the following standard error bound for gradual AST can be derived similarly to Corollary 4.6.
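The triangle-inequality step this snippet repeatedly invokes has the following generic shape (the notation here is assumed, since Eq. (1) itself is not reproduced in the snippet): for a discrepancy measure $R$ between scoring functions $f$, $g$, $h$,

$$R(f, h) \le R(f, g) + R(g, h).$$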


High-Probability Minimax Adaptive Estimation in Besov Spaces via Online-to-Batch

Liautaud, Paul, Gaillard, Pierre, Wintenberger, Olivier

arXiv.org Machine Learning

We study nonparametric regression over Besov spaces from noisy observations under sub-exponential noise, aiming to achieve minimax-optimal guarantees on the integrated squared error that hold with high probability and adapt to the unknown noise level. To this end, we propose a wavelet-based online learning algorithm that dynamically adjusts to the observed gradient noise by adaptively clipping it at an appropriate level, eliminating the need to tune parameters such as the noise variance or gradient bounds. As a by-product of our analysis, we derive high-probability adaptive regret bounds that scale with the $\ell_1$-norm of the competitor. Finally, in the batch statistical setting, we obtain adaptive and minimax-optimal estimation rates for Besov spaces via a refined online-to-batch conversion. This approach carefully exploits the structure of the squared loss in combination with self-normalized concentration inequalities.
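To make the online-to-batch recipe concrete, here is a minimal Python sketch of the generic pattern the abstract describes: online gradient descent on the squared loss with adaptive gradient clipping, followed by iterate averaging. All names and the quantile-based clipping rule are our own illustrative choices, not the authors' wavelet-based algorithm:

```python
import numpy as np

def online_to_batch(stream, dim, clip_quantile=0.9):
    """Illustrative online-to-batch conversion (not the paper's method).

    `stream` yields (x, y) pairs. We run online gradient descent on the
    squared loss with a linear predictor w, clip each gradient at a
    running quantile of past gradient norms (a stand-in for the paper's
    adaptive clipping), and return the averaged iterate, which is the
    standard online-to-batch estimator.
    """
    w = np.zeros(dim)        # current online iterate
    avg = np.zeros(dim)      # running average of iterates
    grad_norms = []
    for t, (x, y) in enumerate(stream, start=1):
        g = 2.0 * (w @ x - y) * x                     # grad of (w.x - y)^2
        norm_g = np.linalg.norm(g)
        grad_norms.append(norm_g)
        tau = np.quantile(grad_norms, clip_quantile)  # adaptive clip level
        if norm_g > tau > 0:
            g *= tau / norm_g                         # clip the gradient
        w = w - g / np.sqrt(t)                        # step size 1/sqrt(t)
        avg += (w - avg) / t                          # update running mean
    return avg
```

Averaging the online iterates is the step that converts a regret bound into a batch risk bound; per the abstract, the paper's refined conversion exploits the structure of the squared loss together with self-normalized concentration inequalities to obtain high-probability rather than in-expectation guarantees.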